Task-Parallel Reductions in OpenMP and OmpSs
Authors
Abstract
The wide adoption of parallel processing hardware in mainstream computing, together with the rising interest in efficient parallel programming in the developer community, increases the demand for programming-model support for common algorithmic patterns. In this paper we present an extension to the OpenMP task construct that adds support for reductions in while-loops and general-recursive functions. Further, we evaluate its implications for the OpenMP standard and present a prototype implementation in OmpSs. Application scalability is achieved through runtime support with static and on-demand thread-private storage allocation. Benchmark results confirm scalability on current SMP systems.
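To make the pattern concrete, here is a minimal sketch of a task reduction inside a while-loop. It does not reproduce the syntax proposed in the paper, which this abstract does not show. Instead it assumes the `task_reduction` and `in_reduction` clauses that were later standardized in OpenMP 5.0. The `node_t` type and the `sum_list` function are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>

/* Hypothetical linked-list node, used only for this illustration. */
typedef struct node {
    int value;
    struct node *next;
} node_t;

/* Sum a linked list by creating one task per node from a while-loop.
 * Assumes OpenMP 5.0 task reductions; not the paper's own syntax. */
long sum_list(node_t *head)
{
    long sum = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* The taskgroup defines the scope of the reduction on sum. */
        #pragma omp taskgroup task_reduction(+: sum)
        {
            node_t *p = head;
            while (p != NULL) {
                /* Each task contributes to sum through a private copy
                 * that the runtime combines when the taskgroup ends. */
                #pragma omp task in_reduction(+: sum) firstprivate(p)
                sum += p->value;

                p = p->next;
            }
        }
    }
    return sum;
}
```

With GCC or Clang this would typically be built with `-fopenmp`. Without that flag the pragmas are ignored and the function runs serially with the same result. The number of iterations is not known when the loop starts, which is why a plain `parallel for` reduction does not fit this case.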
Related resources
Implementation of an Energy-Aware OmpSs Task Scheduling Policy
The OmpSs programming model supports task-based parallelism in a similar manner to OpenMP. This whitepaper explores the possibility of implementing an energy-aware scheduling policy in the run-time component of the OmpSs programming model, to adapt task execution schedules for balancing energy efficiency with parallel performance. A high-level design description of a run-time scheduling plugin to a...
Evaluating the Impact of OpenMP 4.0 Extensions on Relevant Parallel Workloads
For many years, OpenMP has been the most widely used programming model for shared memory architectures. Periodically, new features are proposed and some of them are finally selected for inclusion in the OpenMP standard. The OmpSs programming model developed at the Barcelona Supercomputing Center (BSC) aims to be an OpenMP forerunner that handles the main OpenMP constructs plus some extra feature...
Scaling Irregular Array-type Reductions in OmpSs
Array-type reductions represent a frequently occurring algorithmic pattern in many scientific applications. A special case occurs if array elements are accessed in a non-linear, often random manner, which makes their concurrent and scalable execution difficult. In this work we present a new approach that consists of language and runtime support to facilitate programming and delivers high scalabi...
Efficient Programming for Multicore Processor Heterogeneity: OpenMP versus OmpSs
ARM single-ISA heterogeneous multicore processors combine high-performance big cores with power-efficient small cores. They aim at achieving a suitable balance between performance and energy. However, a main challenge is to program such architectures so as to efficiently exploit their features. In this paper, we study the impact on performance and energy trade-offs of single-ISA architecture ac...
Coarse-Grain Performance Estimator for Heterogeneous Parallel Computing Architectures like Zynq All-Programmable SoC
Heterogeneous computing is emerging as a mandatory requirement for power-efficient system design. With this aim, modern heterogeneous platforms like Zynq All-Programmable SoC, which integrates ARM-based SMP and programmable logic, have been designed. However, those platforms introduce large design cycles consisting of hardware/software partitioning, decisions on granularity and number of hardwar...